Abstract

We consider a decoder with an erasure option and a variable size list decoder
for channels with non-causal side information at the transmitter. First, universally
achievable error exponents are offered for decoding with an erasure option using a
parameterized decoder in the spirit of Csiszar and Korner's decoder. Then, the proposed
decoding rule is generalized by extending the range of its parameters to allow variable
size list decoding. This extension gives a unified treatment of erasure/list decoding.
Exponential bounds on the probability of list error and the average number of
incorrect messages on the list are given. Relations to Forney's and Csiszar and Korner's
decoders for discrete memoryless channels are discussed. These results are obtained by
exploring a random binning code with conditionally constant composition codewords
proposed by Moulin and Wang, but with a different decoding rule.

1 Introduction

A decoder with an erasure option is a decoder which has the option of not deciding, i.e.,
of declaring an "erasure". On the other hand, a variable size list decoder is a decoder which
produces a list of estimates for the correct message rather than a single estimate, where
a list error occurs when the correct message is not on the list. In [1], Forney explored the
random coding error exponents of erasure/list decoding for discrete memoryless channels
(DMCs). These bounds were obtained by analyzing the optimal decoding rule [1, eq. (11)]

$$ \mathbf{y} \in \mathcal{R}_m \ \text{ iff } \ \Pr(\mathbf{y}, \mathbf{x}_m) \ge e^{NT} \sum_{m' \neq m} \Pr(\mathbf{y}, \mathbf{x}_{m'}) , \tag{1} $$

where $\Pr(\mathbf{y}, \mathbf{x}_m)$ is the joint probability of the channel output $\mathbf{y}$ and the codeword $\mathbf{x}_m$,
and $T$ is an arbitrary parameter. The bounds were obtained using Gallager's bounding
techniques. Forney showed that the list option and the erasure option are "two sides of
the same coin", namely, by changing the value of $T$ one can switch from list decoding ($T$
is negative) to decoding with an erasure option ($T$ is positive).
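
The switch between the two regimes of rule (1) can be illustrated numerically. The following is a minimal sketch, not Forney's construction itself: the function name `forney_decision_regions` and the toy values of $\Pr(\mathbf{y}, \mathbf{x}_m)$ are hypothetical, and a single received $\mathbf{y}$ with $N = 1$ is assumed.

```python
import math

def forney_decision_regions(joint_probs, T, N):
    """Return all message indices m (0-based) that satisfy rule (1):
    Pr(y, x_m) >= exp(N*T) * sum_{m' != m} Pr(y, x_m')."""
    total = sum(joint_probs)
    threshold = math.exp(N * T)
    return [m for m, p in enumerate(joint_probs)
            if p >= threshold * (total - p)]

# Hypothetical values of Pr(y, x_m) for one received y and three codewords.
probs = [0.5, 0.3, 0.2]

# Positive T: at most one message can pass, and here none does (an erasure).
print(forney_decision_regions(probs, T=0.5, N=1))

# Negative T: several messages may pass, giving a variable size list.
print(forney_decision_regions(probs, T=-1.0, N=1))
```

An empty result plays the role of an erasure, while a result with more than one index is a list.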

In [2, Th. 5.11], Csiszar and Korner derived universally achievable error exponents for
a decoder with an erasure option for DMCs. These error exponents were obtained by
analyzing the following universal decoding rule [2, p. 176] for constant composition (CC)
codes:

$$ \varphi(\mathbf{y}) = \begin{cases} m, & I(\mathbf{x}_m;\mathbf{y}) > \tilde{R} + \lambda \left| I(\mathbf{x}_{m'};\mathbf{y}) - R \right|^+ \quad \forall m' \neq m \\ 0, & \text{else} \end{cases} \tag{2} $$

where $R$ is the code rate (i.e., $m \in \{1, \ldots, 2^{NR}\}$), $\mathbf{x}_m$ is a codeword taken from a given type
class $T_X$, $\mathbf{y}$ is the channel output, $I(\mathbf{x};\mathbf{y})$ is the empirical mutual information, and $\tilde{R} \ge R$
and $\lambda > 0$ are arbitrary parameters. This decoding rule generalizes the maximum mutual
information (MMI) decoder [2, p. 164] to include an erasure option. The bounds were
obtained using fixed composition coding and by applying the packing lemma derived in
[2, Lemma 5.1]. However, these bounds were not extended to variable size list decoding.
We note that the decoding rule (2) depends on the coding rate $R$, which might limit its
generality. Moreover, it was stated that (2) is an unambiguous decoding rule for $\lambda > 0$,
a fact that was used to derive the error exponents. It turns out that this decoding rule
is unambiguous only when $\lambda \ge 1$. Unlike Forney's decoder (1), no optimality claims
were made for this decoder, but in [3, Sec. 4.4.3] Telatar stated that these bounds are
"essentially the same as those in [1]".

Recently, Moulin [4] generalized Csiszar's decoder using a weighting function:

$$ \varphi_F(\mathbf{y}) = \begin{cases} m, & I(\mathbf{x}_m;\mathbf{y}) > R + \max_{m' \neq m} F\left( I(\mathbf{x}_{m'};\mathbf{y}) - R \right) \\ 0, & \text{else} \end{cases} $$

where $F(\cdot)$ is a continuous, non-decreasing function. The corresponding error exponents
were analyzed and it was shown that for some rates and channels these error exponents
coincide with Forney's error exponents. Note that Moulin's proposed decoder is a function
of the code rate $R$, similarly to Csiszar's decoder.

In [5], [6], Telatar and Gallager proposed tighter exponential bounds on decoding with
an erasure option and list decoding for DMCs using the method of types. These bounds
are not universal in general since the decoding metric depends on the channel statistics.
However, it is claimed that under certain conditions these bounds are tighter than Forney's
bounds. See [6, Sec. III].

As far as we know, no similar bounds have been offered for discrete memoryless channels
with random states that are observed by the encoder but not by the decoder [7]. For
ordinary decoding (without an erasure/list option), Moulin and Wang [8] recently derived an
achievable error exponent for channels with state information present non-causally at the
transmitter. These results were obtained by analyzing the error probability of a stacked
binning scheme and a maximum penalized mutual information (MPMI) decoder.

In this work, we use the random code construction proposed by Moulin and Wang
[8] to derive achievable error exponents for decoding with an erasure option and variable
size list decoding. In Section 2 we propose a parameterized decoding rule with an erasure
option in the spirit of (2). In Section 3 we derive universally achievable error exponents by
analyzing the proposed decoding rule. In Section 4 achievable error exponents are offered
for a decoder with a list option. These exponents are obtained by extending the range of
the proposed decoder's parameters to allow decoding with a list option. The generalized
decoding rule enables a unified treatment of erasure/list decoding similar to Forney's
decoder (1). In Section 5 relations to Forney's and Csiszar and Korner's decoders for
DMCs are discussed. Moreover, it is shown that the obtained error exponents generalize
some known results.

2 Notation and Preliminaries 

We begin with some notation and definitions. Throughout this work, capital letters
represent scalar random variables (RVs), and specific realizations of them are denoted by
the corresponding lowercase letters. Random vectors of dimension $N$ will be denoted by
boldface letters. The notation $1\{A\}$, where $A$ is an event, will designate the indicator
function of $A$ (i.e., $1\{A\} = 1$ if $A$ occurs and $0$ otherwise). The notation $a_n \doteq b_n$,
for two positive sequences $\{a_n\}_{n \ge 1}$ and $\{b_n\}_{n \ge 1}$, expresses asymptotic equality in the
logarithmic scale, i.e.,

$$ \lim_{n \to \infty} \frac{1}{n} \ln \frac{a_n}{b_n} = 0 . $$

Let the vector $P_{\mathbf{x}} = \{P_{\mathbf{x}}(a),\ a \in \mathcal{X}\}$ denote the empirical distribution induced by a
vector $\mathbf{x} \in \mathcal{X}^n$, where $P_{\mathbf{x}}(a) = \frac{1}{n} \sum_{i=1}^{n} 1\{x_i = a\}$. The type class $T_{\mathbf{x}}$ is the set of vectors
$\mathbf{x}' \in \mathcal{X}^n$ such that $P_{\mathbf{x}'} = P_{\mathbf{x}}$. A type class induced by the empirical distribution $P_{\mathbf{x}}$ will be
denoted by $T(P_{\mathbf{x}})$. Similarly, the joint empirical distribution induced by $(\mathbf{x},\mathbf{y}) \in \mathcal{X}^n \times \mathcal{Y}^n$
is the vector $P_{\mathbf{xy}} = \{P_{\mathbf{xy}}(a,b),\ a \in \mathcal{X},\ b \in \mathcal{Y}\}$ where

$$ P_{\mathbf{xy}}(a,b) = \frac{1}{n} \sum_{i=1}^{n} 1\{x_i = a,\ y_i = b\}, \quad a \in \mathcal{X},\ b \in \mathcal{Y} , $$

i.e., $P_{\mathbf{xy}}(a,b)$ is the relative frequency of the pair $(a,b)$ along the pair sequence $(\mathbf{x},\mathbf{y})$.
Likewise, the type class $T_{\mathbf{xy}}$ is the set of all pairs $(\mathbf{x}',\mathbf{y}') \in \mathcal{X}^n \times \mathcal{Y}^n$ such that $P_{\mathbf{x}'\mathbf{y}'} = P_{\mathbf{xy}}$.
The conditional type class $T_{\mathbf{y}|\mathbf{x}}$, for given vectors $\mathbf{x} \in \mathcal{X}^n$ and $\mathbf{y} \in \mathcal{Y}^n$, is the set of all
vectors $\mathbf{y}' \in \mathcal{Y}^n$ such that $T_{\mathbf{xy}'} = T_{\mathbf{xy}}$. The Kullback-Leibler divergence between two
distributions $P$ and $Q$ on $\mathcal{A}$, where $|\mathcal{A}| < \infty$, is defined as

$$ \mathcal{D}(P \| Q) = \sum_{a \in \mathcal{A}} P(a) \ln \frac{P(a)}{Q(a)} , $$

with the conventions that $0 \ln 0 = 0$, and $p \ln \frac{p}{0} = \infty$ if $p > 0$. We denote the empirical
entropy of a vector $\mathbf{x} \in \mathcal{X}^n$ by $H(\mathbf{x})$, where $H(\mathbf{x}) = -\sum_{a \in \mathcal{X}} P_{\mathbf{x}}(a) \ln P_{\mathbf{x}}(a)$. Other
information theoretic quantities governed by empirical distributions (e.g., conditional
empirical entropy, empirical mutual information) will be denoted similarly. Finally, we define
$|t|^+ = \max\{0,t\}$ and $\exp_2(t) = 2^t$.
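
The empirical quantities defined above are straightforward to compute. The sketch below is illustrative only: the function names are hypothetical, natural logarithms are used as in the definition of $H(\mathbf{x})$, and the empirical mutual information is computed through the standard identity $I(\mathbf{x};\mathbf{y}) = H(\mathbf{x}) + H(\mathbf{y}) - H(\mathbf{x},\mathbf{y})$.

```python
from collections import Counter
from math import log

def empirical_distribution(x):
    """Type of the sequence x: relative frequency of each symbol."""
    n = len(x)
    return {a: count / n for a, count in Counter(x).items()}

def empirical_entropy(x):
    """H(x) = -sum_a P_x(a) ln P_x(a); symbols with zero count never appear."""
    return -sum(p * log(p) for p in empirical_distribution(x).values())

def empirical_mutual_information(x, y):
    """I(x;y) = H(x) + H(y) - H(x,y), all taken with respect to types."""
    pairs = list(zip(x, y))
    return empirical_entropy(x) + empirical_entropy(y) - empirical_entropy(pairs)

def kl_divergence(P, Q):
    """D(P||Q) with the conventions 0 ln 0 = 0 and p ln(p/0) = infinity."""
    total = 0.0
    for a, p in P.items():
        if p == 0:
            continue
        q = Q.get(a, 0.0)
        if q == 0:
            return float("inf")
        total += p * log(p / q)
    return total

def clip_plus(t):
    """|t|^+ = max{0, t}."""
    return max(0.0, t)

x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
print(empirical_distribution(x))
print(empirical_mutual_information(x, x), empirical_mutual_information(x, y))
```

In this toy example the pair $(\mathbf{x},\mathbf{x})$ attains the full empirical entropy $\ln 2$, while $(\mathbf{x},\mathbf{y})$ has a product joint type and hence zero empirical mutual information.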

Consider a discrete memoryless state dependent channel with a finite input alphabet
$\mathcal{X}$, a finite state alphabet $\mathcal{S}$, a finite output alphabet $\mathcal{Y}$, and a probability transition
distribution $W(y|x,s)$. Given an input sequence $\mathbf{x}$ and a state sequence $\mathbf{s}$ emitted from
a discrete memoryless source $P_S(\mathbf{s}) = \prod_{i=1}^{N} P_S(s_i)$, the channel output sequence $\mathbf{y}$ is
generated according to the conditional distribution $W(\mathbf{y}|\mathbf{x},\mathbf{s}) = \prod_{i=1}^{N} W(y_i|x_i,s_i)$. A
message $m \in \{1, \ldots, M\}$ is to be transmitted to the receiver. We assume that the state
sequence $\mathbf{s}$ is available at the transmitter non-causally, but not at the receiver. We also
assume that all messages are a priori equiprobable. Given $\mathbf{s}$ and $m$, the transmitter
produces a sequence $\mathbf{x} = f_N(\mathbf{s},m)$ which is used to convey message $m$ to the decoder.

2.1 Codebook construction [8]

In [8, p. 1337], Moulin and Wang used in their derivation a binning code with conditionally
constant composition (CCC) codewords. This code will be used in our proofs. For the
sake of completeness, we briefly describe the code construction and the encoding process.
The decoding part will be described in detail later. The code construction requires the
use of an auxiliary random variable $U \in \mathcal{U}$ which takes on values in a finite set of size
$|\mathcal{X}||\mathcal{S}| + 1$. See [8, Sec. III.E] for more information.

For a given empirical conditional distribution $P^*_{XU|S}$, a sub-code $\mathcal{C}(P_{\mathbf{s}})$ is constructed
for each state sequence type class $T_{\mathbf{s}} = T(P_{\mathbf{s}})$. Given a state type class $T(P_{\mathbf{s}})$, compute
the marginal distribution

$$ P^*_U(u) = \sum_{x,s} P_{\mathbf{s}}(s) P^*_{XU|S}(x,u|s) , $$

where $P_{\mathbf{s}}$ is the empirical distribution induced by $T_{\mathbf{s}}$. Note that $P^*_U(u)$ is a function of
$P_{\mathbf{s}}$ and it might be different for a different state type class. Draw $2^{N(R+\rho(P_{\mathbf{s}}))}$ random
vectors independently from the type class $T^*_U(P_{\mathbf{s}})$ induced by $P^*_U$, according to the uniform
distribution, where $\rho(\cdot)$ is a general bin-depth function. Arrange the vectors in an array
with $M = 2^{NR}$ columns and $2^{N\rho(P_{\mathbf{s}})}$ rows. The code $\mathcal{C}$ is the union of all sub-codes, i.e.,
$\mathcal{C} = \bigcup_{P_{\mathbf{s}}} \mathcal{C}(P_{\mathbf{s}})$. Note that the number of these sub-codes is polynomial in $N$. In this
work, we choose $\rho(P_{\mathbf{s}}) = I^*_{US}(P_{\mathbf{s}}) + \epsilon$, where

$$ I^*_{US}(P_{\mathbf{s}}) = \sum_{u,s} P_{\mathbf{s}}(s) P^*_{U|S}(u|s) \log \frac{P^*_{U|S}(u|s)}{P^*_U(u)} , $$

i.e., $I^*_{US}(P_{\mathbf{s}})$ is the mutual information $I(U;S)$ induced by $P_{\mathbf{s}}(S) \cdot P^*_{U|S}(U|S)$, and $\epsilon$ is an
arbitrarily small positive constant. This choice ensures that the probability of encoding
error vanishes at a double-exponential rate [8, p. 1338].

The encoding of message $m$ given a state sequence $\mathbf{s}$ is done in two steps: (i) Find an
index $l$ such that $\mathbf{u}_{l,m} \in \mathcal{C}(P_{\mathbf{s}})$ is a member of the conditional type class $T^*_{U|\mathbf{s}} = \{\mathbf{u}' :
P_{\mathbf{u}'\mathbf{s}} = P^*_{U|S} P_{\mathbf{s}}\}$. If more than one such $l$ exists, pick one at random under the uniform
distribution. If no such $l$ can be found, pick $\mathbf{u}$ at random from $T^*_{U|\mathbf{s}}$ under the uniform
distribution. (ii) Draw $\mathbf{x}$ uniformly from $T^*_{X|\mathbf{us}}$, induced by $P^*_{XU|S}$ and $(\mathbf{u}_{l,m},\mathbf{s})$. For
notational simplicity, we use the shorthand $\lambda$ to denote the type of state sequences $P_{\mathbf{s}}$,
and $\mathbf{u}_{\lambda,l,m}$ to denote $\mathbf{u}_{l,m} \in \mathcal{C}(\lambda)$.

In [8], a maximum penalized mutual information (MPMI) decoder was used to decode
the above code. An MPMI decoder seeks a vector $\mathbf{u} \in \mathcal{C}$ that maximizes the penalized
empirical mutual information criterion $\max_{\lambda} \max_{\mathbf{u} \in \mathcal{C}(\lambda)} [I(\mathbf{u};\mathbf{y}) - \psi(P_{\mathbf{s}})]$, where $\psi(\cdot)$
is a general penalty function. It was shown that the optimal choice of these functions is
$\rho(P_{\mathbf{s}}) = \psi(P_{\mathbf{s}}) = I^*_{US}(P_{\mathbf{s}}) + \epsilon$, where $\epsilon$ is an arbitrarily small positive constant. In this
work, we assume that $\psi(P_{\mathbf{s}}) = I^*_{US}(P_{\mathbf{s}})$, the mutual information $I(U;S)$ induced by
$P_{\mathbf{s}} P^*_{U|S}$, for reasons that will be given later. To allow decoding with an erasure/list
option, we propose to modify the MPMI decoding rule in the spirit of (2). We choose
$\rho(P_{\mathbf{s}}) = \psi(P_{\mathbf{s}}) = I^*_{US}(P_{\mathbf{s}}) + \epsilon$ for reasons that will be discussed in Section 5.
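
The array structure of a sub-code and the first encoding step of Subsection 2.1 can be sketched as follows. This is a toy illustration under stated assumptions, not the construction of [8] in full: the function names are hypothetical, a type class is specified by explicit symbol counts, the array dimensions are passed directly rather than computed from $R$ and $\rho(P_{\mathbf{s}})$, and step (ii) (drawing $\mathbf{x}$) is omitted.

```python
import random
from collections import Counter

def draw_from_type_class(composition, rng):
    """Uniform draw from the type class with the given symbol counts.
    Shuffling a fixed multiset is uniform over its distinct arrangements."""
    seq = [a for a, count in composition.items() for _ in range(count)]
    rng.shuffle(seq)
    return tuple(seq)

def build_subcode(composition, num_rows, num_cols, rng):
    """Array u[l][m]: rows index the bin depth l, columns index the message m."""
    return [[draw_from_type_class(composition, rng) for _ in range(num_cols)]
            for _ in range(num_rows)]

def joint_type(u, s):
    """Joint type of (u, s), kept as pair counts."""
    return Counter(zip(u, s))

def encode_step_one(subcode, m, s, target_joint_counts):
    """Step (i): all row indices l in column m whose joint type with s matches
    the target. The caller picks one at random, or falls back when empty."""
    return [l for l, row in enumerate(subcode)
            if joint_type(row[m], s) == target_joint_counts]

rng = random.Random(0)
subcode = build_subcode({0: 2, 1: 2}, num_rows=4, num_cols=2, rng=rng)
s = (0, 0, 1, 1)
target = joint_type((0, 0, 1, 1), s)  # hypothetical target joint type
candidates = encode_step_one(subcode, m=0, s=s, target_joint_counts=target)
print(len(subcode), len(subcode[0]), candidates)
```

An empty `candidates` list corresponds to the encoding error event analyzed in the proof of Theorem 1.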

2.2 The proposed decoding rule 

For a given code $\mathcal{C}$ constructed as described in Subsection 2.1, we propose to use the
following decoder $\varphi : \mathcal{Y}^N \to \{0, 1, \ldots, M\}$ with an erasure option: Declare $m$ if

$$ I(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - I^*_{US}(\lambda) > T + \alpha \left| I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') \right|^+ \quad \forall m' \neq m,\ \lambda',\ l' , \tag{3} $$

otherwise, declare $0$ (i.e., "erasure"), where $\alpha \ge 1$ and $T \ge 0$ are arbitrary parameters.

Our first step is to show that this decoder is unambiguous, i.e., at most one message
index taken from $\{1, \ldots, M\}$ fulfills (3). This property is essential to allow decoding with
an erasure option. It is stated in the following lemma:

Lemma 1. For $\alpha \ge 1$ and $T \ge 0$, the proposed decoding rule (3) is unambiguous.

The proof of Lemma 1 is deferred to the Appendix. Using a proof similar to that of Lemma 1,
it can be shown that Csiszar and Korner's decoder (2) might be ambiguous if $0 < \lambda < 1$,
contrary to the statement made in [2, Th. 5.11].
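
A small numerical sketch may help to see why the range of $\alpha$ matters in rule (3). It rests on assumptions that are not part of the paper: the penalized metrics $I(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - I^*_{US}(\lambda)$ are supplied directly as hypothetical numbers, the function names are invented, and message $m$ is taken to satisfy (3) when its best codeword beats every codeword of every other message.

```python
def clip_plus(t):
    """|t|^+ = max{0, t}."""
    return max(0.0, t)

def satisfying_messages(metrics, T, alpha):
    """metrics[m] lists the penalized metrics of all codewords of message m.
    Returns every message whose best metric exceeds T + alpha*|J'|^+ for the
    metric J' of each codeword of each competing message."""
    winners = []
    for m, own in metrics.items():
        best_other = max(v for m2, vals in metrics.items() if m2 != m for v in vals)
        if max(own) > T + alpha * clip_plus(best_other):
            winners.append(m)
    return winners

def erasure_decode(metrics, T, alpha):
    """Decoder with an erasure option: returns m, or 0 for an erasure.
    Assumes alpha >= 1 and T >= 0, so that at most one message qualifies."""
    winners = satisfying_messages(metrics, T, alpha)
    return winners[0] if winners else 0

# Hypothetical penalized metrics for three messages.
metrics = {1: [0.9, 0.2], 2: [0.3, 0.1], 3: [-0.2]}
print(erasure_decode(metrics, T=0.1, alpha=1.5))   # message 1 is declared
print(erasure_decode(metrics, T=0.5, alpha=1.5))   # threshold too high: erasure

# With alpha < 1 two messages can both pass the test, even for T >= 0.
print(satisfying_messages({1: [0.4], 2: [0.35]}, T=0.1, alpha=0.5))
```

The last call shows two messages passing the test at once when the multiplier is below one, which is the kind of ambiguity discussed above for $0 < \lambda < 1$.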

3 Erasure Option 

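Given a code $\mathcal{C}$, a decoder with an erasure option is a partition of $\mathcal{Y}^N$ into $(M + 1)$ regions
$\mathcal{R}_0, \mathcal{R}_1, \ldots, \mathcal{R}_M$. The decoder decides in favor of message $m$ if $\mathbf{y} \in \mathcal{R}_m$, $m = 1, \ldots, M$, or
it declares "erasure" if $\mathbf{y} \in \mathcal{R}_0$. Following Forney [1], let us define two error events. The
event $\mathcal{E}_1$ is the event in which $\mathbf{y}$ does not fall in the decision region of the correct message.
The event $\mathcal{E}_2$ is the event of undetected error, namely, the event in which $\mathbf{y}$ falls in $\mathcal{R}_{m'}$,
$m' \neq 0$, while $m \neq m'$ was transmitted. The probabilities of these error events are given by

$$ \Pr\{\mathcal{E}_1\} = \frac{1}{M} \sum_{m=1}^{M} \sum_{\mathbf{y} \in \mathcal{R}_m^c} P(\mathbf{y}|\mathbf{x}_m) \tag{4} $$

$$ \Pr\{\mathcal{E}_2\} = \frac{1}{M} \sum_{m=1}^{M} \sum_{\mathbf{y} \in \mathcal{R}_m} \sum_{m' \neq m} P(\mathbf{y}|\mathbf{x}_{m'}) \tag{5} $$

where $P(\mathbf{y}|\mathbf{x}_m) = \sum_{\mathbf{s} \in \mathcal{S}^N} P_S(\mathbf{s}) W^N(\mathbf{y}|\mathbf{x}_m(\mathbf{s}),\mathbf{s})$.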

Let $J(P_S P_{UX|S} P_{Y|XS}) = I(U;Y) - I(U;S)$, where $P_S$, $P_{UX|S}$, and $P_{Y|XS}$ are three
(conditional) probability distributions of the quadruplet of RVs $(U, S, X, Y)$. The following
theorem presents exponential bounds on $\Pr\{\mathcal{E}_1\}$ and $\Pr\{\mathcal{E}_2\}$ for decoding with an erasure
option:

Theorem 1. For every $\alpha \ge 1$ and $T \ge 0$ there exists an $N$-length block code of rate $R$
such that the following error exponents can be achieved simultaneously

$$ \Pr\{\mathcal{E}_1\} \le \exp_2\{-N E_1(R,W,T,\alpha)\} \tag{6} $$
$$ \Pr\{\mathcal{E}_2\} \le \exp_2\{-N E_2(R,W,T,\alpha)\} \tag{7} $$

where

$$ E_1(R,W,T,\alpha) = \min_{P_S} \max_{P_{UX|S}} \min\Big\{ \min_{P_{Y|XS}:\ J(P_S P_{UX|S} P_{Y|XS}) \le T} \mathcal{D}(P_S P_{UX|S} P_{Y|XS} \| P_S P_{UX|S} W),\ \min_{P_{Y|XS}} \Big[ \mathcal{D}(P_S P_{UX|S} P_{Y|XS} \| P_S P_{UX|S} W) + \Big| \frac{1}{\alpha}\big( J(P_S P_{UX|S} P_{Y|XS}) - T \big) - R \Big|^+ \Big] \Big\} \tag{8} $$

and

$$ E_2(R,W,T,\alpha) = \min_{P_S} \max_{P_{UX|S}} \min_{P_{Y|XS}} \Big\{ \mathcal{D}(P_S P_{UX|S} P_{Y|XS} \| P_S P_{UX|S} W) + \Big| T + \alpha \big| J(P_S P_{UX|S} P_{Y|XS}) \big|^+ - R \Big|^+ \Big\} . \tag{9} $$



Proof. We analyze $\Pr\{\mathcal{E}_1\}$ and $\Pr\{\mathcal{E}_2\}$ using the proposed decoder (3). The proof is similar
in some parts to the derivation in the proof of Theorem 3.2 in [8], but it is given in
full for the sake of completeness.

Fix a probability distribution $P_{UX|S}$ and construct a code $\mathcal{C}$ as described in
Subsection 2.1. An encoding error occurs when the first encoding step fails, namely, given $m$
and $\mathbf{s}$ there is no index $l$ such that $\mathbf{u}_{\lambda,l,m} \in T^*_{U|\mathbf{s}}$. Since each $\mathbf{u}_{\lambda,l,m}$ is drawn randomly
according to the uniform distribution from $T^*_U(P_{\mathbf{s}})$, it follows that

$$ \Pr\{\mathcal{E}_c(m)|\mathbf{s}\} = \Big[ 1 - \Pr\big\{ U \in T^*_{U|\mathbf{s}} \,\big|\, U \in T^*_U(P_{\mathbf{s}}) \big\} \Big]^{2^{N\rho(P_{\mathbf{s}})}} \tag{10} $$

where $\mathcal{E}_c(m)$ denotes the encoding error event when message index $m$ is encoded, and

$$ \Pr\big\{ U \in T^*_{U|\mathbf{s}} \,\big|\, U \in T^*_U(P_{\mathbf{s}}) \big\} = \frac{|T^*_{U|\mathbf{s}}|}{|T^*_U(P_{\mathbf{s}})|} \doteq 2^{-N I^*_{US}(P_{\mathbf{s}})} . \tag{11} $$

Since $\rho(P_{\mathbf{s}})$ was chosen to be greater than $I^*_{US}(P_{\mathbf{s}})$ by $\epsilon$, we get that the probability of an
encoding error of message $m$ given $\mathbf{s}$ is upper bounded by

$$ \Pr[\mathcal{E}_c(m)|T(P_{\mathbf{s}})] \le \exp\{-2^{N\epsilon}\} , \tag{12} $$

namely, the probability of encoding error decays at a double-exponential rate. See step 1
in [8, p. 1338] for more details.
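
The double-exponential decay in (12) can be checked numerically. The sketch below is a simplification: it treats the probability in (11) as exactly $2^{-N I^*_{US}}$ (ignoring the sub-exponential factors hidden by $\doteq$), the function name and the values of $I^*_{US}$ and $\epsilon$ are hypothetical, and it relies on $(1-p)^K \le e^{-pK}$.

```python
import math

def encoding_failure_bound(N, I_us, eps):
    """Return ((1 - p)^K, exp(-2^{N*eps})) with p = 2^{-N*I_us} and
    K = 2^{N*(I_us + eps)} rows, i.e. the failure probability of step (i)
    under the idealized value of p, and the bound of the form (12)."""
    p = 2.0 ** (-N * I_us)
    K = 2.0 ** (N * (I_us + eps))
    exact = math.exp(K * math.log1p(-p))   # (1 - p)^K, computed stably
    bound = math.exp(-(2.0 ** (N * eps)))
    return exact, bound

for N in (10, 20, 40):
    exact, bound = encoding_failure_bound(N, I_us=0.5, eps=0.1)
    print(N, exact, bound)
```

Because $pK = 2^{N\epsilon}$, the bound does not depend on $I^*_{US}$, which is why an arbitrarily small $\epsilon > 0$ in the bin depth suffices.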

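The undetected error probability can be expressed as follows

$$ \begin{aligned} \Pr\{\mathcal{E}_2\} &= \frac{1}{M} \sum_{m=1}^{M} \Pr\{\mathcal{E}_2 \,|\, m \text{ is to be sent}\} \\ &= \Pr\{\mathcal{E}_2(1) \,|\, m = 1\} \\ &= \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}} | m = 1) \Pr\{\mathcal{E}_2(1) \,|\, T_{\mathbf{usxy}}, m = 1\} \\ &\le \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}} | m = 1) \Big[ \Pr\{\mathcal{E}_c(1) \,|\, T_{\mathbf{usxy}}, m = 1\} + \Pr\{\mathcal{E}_2(1) \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\} \Big] \end{aligned} \tag{13} $$

where $\mathcal{E}_2(1)$ is the event of undetected error given that $m = 1$ was sent, and $T_{\mathbf{usxy}}$ is the
joint type class of the quadruplet $(\mathbf{u},\mathbf{s},\mathbf{x},\mathbf{y})$. Since all messages are drawn according to
the same probability distributions, the probability of a type class $T_{\mathbf{usxy}}$ is independent
of the message index $1 \le m \le M$. Therefore,

$$ P(T_{\mathbf{usxy}} | m = 1) = P(T_{\mathbf{usxy}}) \doteq \exp_2\{-N \mathcal{D}(P_{\mathbf{s}} P_{\mathbf{ux}|\mathbf{s}} P_{\mathbf{y}|\mathbf{xs}} \| P_{\mathbf{s}} P_{\mathbf{ux}|\mathbf{s}} W)\} , \tag{14} $$

as was shown in [8, eq. (5.12)]. An undetected error can occur only if there is a $\mathbf{u}_{\lambda',l',m'} \in \mathcal{C}$
such that

$$ I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') > T + \alpha \left| I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \right|^+ , \tag{15} $$

conditioned on $\mathbf{u}$, $\mathbf{s}$, $\mathbf{y}$ and $T_{\mathbf{usxy}}$.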

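Following [8, eq. (5.13)], the undetected error probability satisfies

$$ \Pr\{\mathcal{E}_2(1) \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\} = 1 - \prod_{P_{\mathbf{s}'}} \Big[ 1 - P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) \Big]^{2^{N I^*_{US}(P_{\mathbf{s}'})}(2^{NR} - 1)} , \tag{16} $$

where $P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}})$ is the probability that for some $l'$ and $m' \neq 1$, $\mathbf{u}_{l',m'} \in \mathcal{C}(P_{\mathbf{s}'})$
fulfills (15) conditioned on $\mathbf{u}$, $\mathbf{y}$ and $T_{\mathbf{usxy}}$. $P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}})$ can be expressed as
follows

$$ P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) = \sum_{\mathbf{u}' \in U_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})} P(\mathbf{u}'|P_{\mathbf{s}'}) = \sum_{\mathbf{u}' \in U_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})} \frac{1}{|T^*_U(P_{\mathbf{s}'})|} $$

where

$$ U_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}}) = \Big\{ \mathbf{u}' \in T^*_U(P_{\mathbf{s}'}) : I(\mathbf{u}';\mathbf{y}) - I^*_{US}(P_{\mathbf{s}'}) > T + \alpha \left| I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \right|^+ \Big\} . \tag{17} $$

Clearly, $U_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})$ is contained in the following set of conditional types

$$ \begin{aligned} \mathcal{T}_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}}) &= \Big\{ T_{\mathbf{u}'|\mathbf{y}} : T_{\mathbf{u}'} = T^*_U(P_{\mathbf{s}'}),\ I(\mathbf{u}';\mathbf{y}) - I^*_{US}(P_{\mathbf{s}'}) > T + \alpha \left| I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \right|^+ \Big\} \\ &\subseteq \Big\{ T_{\mathbf{u}'|\mathbf{y}} : I(\mathbf{u}';\mathbf{y}) - I^*_{US}(P_{\mathbf{s}'}) > T + \alpha \left| I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \right|^+ \Big\} . \end{aligned} $$

Using similar steps as in [8, eq. (5.14)-(5.17)], we get

$$ \begin{aligned} P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) &\le \sum_{T_{\mathbf{u}'|\mathbf{y}} \in \mathcal{T}_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})} \frac{|T_{\mathbf{u}'|\mathbf{y}}|}{|T_{\mathbf{u}'}|} \\ &\doteq \sum_{T_{\mathbf{u}'|\mathbf{y}} \in \mathcal{T}_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})} 2^{-N I(\mathbf{u}';\mathbf{y})} \\ &\le \exp_2\Big\{ -N \Big[ T + \alpha \left| I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \right|^+ + I^*_{US}(P_{\mathbf{s}'}) \Big] \Big\} . \end{aligned} \tag{18} $$

We now use the following bound [8, eq. (5.18)], which can be regarded as a generalized union
bound:

$$ 1 - \prod_i (1 - a_i)^{t_i} \le \min\Big\{ 1, \sum_i a_i t_i \Big\} , \qquad 0 \le a_i \le 1,\ t_i \ge 1 . \tag{19} $$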
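
The generalized union bound (19) is easy to sanity-check numerically. The sketch below evaluates both sides for given sequences; the function name and the test values are hypothetical, and the check is an illustration rather than a proof.

```python
import math

def generalized_union_bound_sides(a, t):
    """Return (lhs, rhs) of 1 - prod_i (1 - a_i)^{t_i} <= min{1, sum_i a_i t_i},
    assuming 0 <= a_i <= 1 and t_i >= 1."""
    lhs = 1.0 - math.prod((1.0 - ai) ** ti for ai, ti in zip(a, t))
    rhs = min(1.0, sum(ai * ti for ai, ti in zip(a, t)))
    return lhs, rhs

print(generalized_union_bound_sides([0.1, 0.2], [1, 2]))
print(generalized_union_bound_sides([0.5, 0.9], [3, 4]))
```

In the proof, $a_i$ plays the role of a per-codeword probability such as $P_{e2}$ and $t_i$ the number of competing codewords in the corresponding sub-code, so the right-hand side is the usual union bound truncated at one.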



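Applying (19) together with (18), we get

$$ \begin{aligned} \Pr\{\mathcal{E}_2(1) \,|\, T_{\mathbf{usxy}}, \mathcal{E}_c(1)^c\} &= 1 - \prod_{P_{\mathbf{s}'}} \Big[ 1 - P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) \Big]^{2^{N I^*_{US}(P_{\mathbf{s}'})}(2^{NR} - 1)} \\ &\le \min\Big\{ 1, \sum_{P_{\mathbf{s}'}} P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}})\, 2^{N I^*_{US}(P_{\mathbf{s}'})}(2^{NR} - 1) \Big\} \\ &\le \exp_2\Big\{ -N \Big| T + \alpha \left| I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \right|^+ - R \Big|^+ \Big\} . \end{aligned} \tag{20} $$

Combining (13), (14), and (20), and optimizing over $P_{UX|S}$ and $P_S$, we get that

$$ \Pr\{\mathcal{E}_2\} \le \exp_2\{-N E_2(R,W,T,\alpha)\} , \tag{21} $$

where $E_2(R,W,T,\alpha)$ is given in (9).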

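Similarly to the derivation of $E_2(R,W,T,\alpha)$, we can upper bound the probability of not
making the right decision, denoted by $\Pr\{\mathcal{E}_1\}$. This error event occurs when the received
$\mathbf{y}$ does not belong to the decision region corresponding to the transmitted message $m$.
Therefore, an error occurs when

$$ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \le T + \alpha \left| I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') \right|^+ \tag{22} $$

for some $\lambda'$, $l'$ and $m' \neq m$, conditioned on $\mathbf{u}$, $\mathbf{s}$, $\mathbf{y}$ and $T_{\mathbf{usxy}}$. This happens if and only
if

$$ I(\mathbf{u};\mathbf{y}) - I^*_{US}(\lambda) \le T , \tag{23} $$

or

$$ T < I(\mathbf{u};\mathbf{y}) - I^*_{US}(\lambda) \le T + \alpha \left| I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') \right|^+ . \tag{24} $$

Under (24), $I(\mathbf{u};\mathbf{y}) - I^*_{US}(\lambda) - T$ is strictly positive since $T \ge 0$ and $I(\mathbf{u};\mathbf{y}) - I^*_{US}(\lambda)$
is strictly greater than $T$. Moreover, (24) implies that $I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda')$ must be
positive too, since

$$ T < T + \alpha \left| I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') \right|^+ \ \Longrightarrow \ \alpha \left| I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') \right|^+ > 0, \quad \alpha \ge 1 , $$

which means that the clipping function $|\cdot|^+$ was not active, namely, $I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) -
I^*_{US}(\lambda') > 0$. Therefore, (24) implies that

$$ \frac{1}{\alpha} \Big[ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) - T \Big] \le I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') . \tag{25} $$

Hence, the event of not making the right decision is a union of two disjoint events, (23)
and (25). Therefore,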



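$$ \begin{aligned} \Pr\{\mathcal{E}_1\} &= \frac{1}{M} \sum_{m=1}^{M} \Pr\{\mathcal{E}_1 \,|\, m \text{ is to be sent}\} \\ &= \Pr\{\mathcal{E}_1(1) \,|\, m = 1\} \\ &= \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}}) \Pr\{\mathcal{E}_1(1) \,|\, T_{\mathbf{usxy}}, m = 1\} \\ &\le \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}}) \Big[ \Pr\{\mathcal{E}_c(1) \,|\, T_{\mathbf{usxy}}, m = 1\} + 1\{ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \le T \} + \Pr\{\mathcal{A}(1) \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\} \Big] , \end{aligned} \tag{26} $$

where $\mathcal{E}_1(1)$ is the event of making the wrong decision given that $m = 1$, and $\mathcal{A}(1)$ is the
event in which $T < I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \le T + \alpha [ I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') ]$ for some $\lambda'$, $l'$ and
$m' \neq 1$. The last sum can be rewritten as follows

$$ \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}} | m = 1) \Big[ \Pr\{\mathcal{E}_c(1) \,|\, T_{\mathbf{usxy}}, m = 1\} + \Pr\{\mathcal{A}(1) \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\} \Big] + \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}} | m = 1)\, 1\{ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \le T \} . \tag{27} $$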



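The second summand of (27) can easily be estimated using the method of types and by
applying (14):

$$ \begin{aligned} \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}} | m = 1)\, 1\{ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \le T \} &= \sum_{T_{\mathbf{usxy}}:\ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \le T} P(T_{\mathbf{usxy}}) \\ &\doteq \exp_2\Big\{ -N \min_{P_{\mathbf{usxy}}:\ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \le T} \mathcal{D}(P_{\mathbf{s}} P_{\mathbf{ux}|\mathbf{s}} P_{\mathbf{y}|\mathbf{xs}} \| P_{\mathbf{s}} P_{\mathbf{ux}|\mathbf{s}} W) \Big\} . \end{aligned} \tag{28} $$

As for the first summand of (27), it can be upper bounded similarly to the undetected
error probability in the following way:

$$ \Pr\{\mathcal{A}(1) \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\} = 1 - \prod_{P_{\mathbf{s}'}} \Big[ 1 - P_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) \Big]^{2^{N I^*_{US}(P_{\mathbf{s}'})}(2^{NR} - 1)} \tag{29} $$

where $P_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}})$ is the probability that $\mathbf{u}_{l',m'} \in \mathcal{C}(P_{\mathbf{s}'})$ fulfills (25) conditioned
on $\mathbf{u}$, $\mathbf{y}$ and $T_{\mathbf{usxy}}$ for some $l'$ and $m' \neq 1$. Therefore,

$$ P_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) = \sum_{\mathbf{u}' \in U_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})} P(\mathbf{u}'|P_{\mathbf{s}'}) = \sum_{\mathbf{u}' \in U_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})} \frac{1}{|T^*_U(P_{\mathbf{s}'})|} \tag{30} $$

and

$$ U_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}}) = \Big\{ \mathbf{u}' \in T^*_U(P_{\mathbf{s}'}) : I(\mathbf{u}';\mathbf{y}) - I^*_{US}(P_{\mathbf{s}'}) \ge \frac{1}{\alpha} \big[ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) - T \big] \Big\} . \tag{31} $$

Again, $U_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})$ is contained in the following set of conditional types

$$ \begin{aligned} \mathcal{T}_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}}) &= \Big\{ T_{\mathbf{u}'|\mathbf{y}} : T_{\mathbf{u}'} = T^*_U(P_{\mathbf{s}'}),\ I(\mathbf{u}';\mathbf{y}) - I^*_{US}(P_{\mathbf{s}'}) \ge \frac{1}{\alpha} \big[ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) - T \big] \Big\} \\ &\subseteq \Big\{ T_{\mathbf{u}'|\mathbf{y}} : I(\mathbf{u}';\mathbf{y}) - I^*_{US}(P_{\mathbf{s}'}) \ge \frac{1}{\alpha} \big[ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) - T \big] \Big\} . \end{aligned} $$

Using similar steps as in the first part of the proof, we get that

$$ \begin{aligned} P_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) &\le \sum_{T_{\mathbf{u}'|\mathbf{y}} \in \mathcal{T}_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})} \frac{|T_{\mathbf{u}'|\mathbf{y}}|}{|T_{\mathbf{u}'}|} \\ &\doteq \sum_{T_{\mathbf{u}'|\mathbf{y}} \in \mathcal{T}_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},P_{\mathbf{usxy}})} 2^{-N I(\mathbf{u}';\mathbf{y})} \\ &\le \exp_2\Big\{ -N \Big[ \frac{1}{\alpha} \big( I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) - T \big) + I^*_{US}(P_{\mathbf{s}'}) \Big] \Big\} . \end{aligned} $$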

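Applying the union bound (19), we get

$$ \Pr\{\mathcal{A}(1) \,|\, T_{\mathbf{usxy}}, \mathcal{E}_c(1)^c\} = 1 - \prod_{P_{\mathbf{s}'}} \Big[ 1 - P_{e1}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) \Big]^{2^{N I^*_{US}(P_{\mathbf{s}'})}(2^{NR} - 1)} \tag{32} $$

$$ \le \exp_2\Big\{ -N \Big| \frac{1}{\alpha} \big[ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) - T \big] - R \Big|^+ \Big\} . \tag{33} $$

Combining (26), (28), (14), and (33), and optimizing over $P_{UX|S}$ and $P_S$, we get that

$$ \Pr\{\mathcal{E}_1\} \le \exp_2\{-N E_1(R,W,T,\alpha)\} , \tag{34} $$

where $E_1(R,W,T,\alpha)$ is given in (8).

□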

4 List Decoding 

A decoder with a variable size list produces a list of candidate estimates for the correct
message. Let $\mathcal{E}_1$ denote the list error event, namely, the event in which $\mathbf{y}$ does not fall in the
decision region corresponding to the correct message. As stated by Forney [1, p. 206], the
quantity corresponding to $\Pr\{\mathcal{E}_2\}$ under decoding with an erasure option is the average number
of incorrect messages on the list, denoted by $N_I$, where

$$ N_I = \sum_{m'} \Pr\{m' \text{ is on the list and incorrect}\} . \tag{35} $$

In the following theorem, exponential bounds are offered for the probability of $\mathcal{E}_1$ and
for the average number of incorrect messages on the list. These bounds are obtained by
generalizing the proposed decoding rule (3) to the variable list size case by extending the
range of its parameters $\alpha$ and $T$.

Theorem 2. For every $\alpha \in (0, 1)$ and $T \in \mathbb{R}$ there exists an $N$-length block code of rate
$R$ such that the following error exponents can be achieved simultaneously

$$ \Pr\{\mathcal{E}_1\} \le \exp_2\{-N E_1(R,W,T,\alpha)\} \tag{36} $$
$$ N_I \le \exp_2\{-N E_2(R,W,T,\alpha)\} , \tag{37} $$

where

$$ E_1(R,W,T,\alpha) = \min_{P_S} \max_{P_{UX|S}} \min\Big\{ \min_{P_{Y|XS}:\ J(P_S P_{UX|S} P_{Y|XS}) \le T} \mathcal{D}(P_S P_{UX|S} P_{Y|XS} \| P_S P_{UX|S} W),\ \min_{P_{Y|XS}} \Big( \mathcal{D}(P_{Y|XS} \| W | P_S P_{UX|S}) + \Big| \frac{1}{\alpha} \big[ J(P_S P_{UX|S} P_{Y|XS}) - T \big] - R \Big|^+ \Big) \Big\} , \tag{38} $$

and

$$ E_2(R,W,T,\alpha) = \min_{P_S} \max_{P_{UX|S}} \min_{P_{Y|XS}} \Big\{ \mathcal{D}(P_S P_{UX|S} P_{Y|XS} \| P_S P_{UX|S} W) + \Big| T + \alpha \big| J(P_S P_{UX|S} P_{Y|XS}) \big|^+ \Big|^+ - R \Big\} . \tag{39} $$

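Proof. To allow a list option we take $\alpha \in (0, 1)$ and $T \in \mathbb{R}$. Therefore, the following
decoding rule will be used: add $m$ to the list if

$$ I(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - I^*_{US}(\lambda) > T + \alpha \left| I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') \right|^+ $$

for all $\lambda'$, $l'$, $m' \neq m$. An empty list is regarded as "erasure".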
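
To see how the list grows as the threshold is lowered, the rule above can be run on toy numbers. This is a sketch under assumptions: the penalized metrics $I(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - I^*_{US}(\lambda)$ are hypothetical values supplied directly, the function names are invented, and a message is listed when its best codeword passes the test against every codeword of every other message.

```python
def clip_plus(t):
    """|t|^+ = max{0, t}."""
    return max(0.0, t)

def list_decode(metrics, T, alpha):
    """Variable size list decoder. metrics[m] holds the penalized metrics of
    the codewords of message m; alpha in (0, 1) and any real T are allowed.
    Returns the (possibly empty) list of messages that pass the test."""
    decoded = []
    for m, own in metrics.items():
        best_other = max(v for m2, vals in metrics.items() if m2 != m for v in vals)
        if max(own) > T + alpha * clip_plus(best_other):
            decoded.append(m)
    return decoded

# Hypothetical penalized metrics for three messages.
metrics = {1: [0.4], 2: [0.35], 3: [-0.3]}
for T in (0.5, 0.1, -1.0):
    print(T, list_decode(metrics, T, alpha=0.5))
```

Lowering $T$ from a large positive value to a negative one moves the output from an empty list (an erasure) to a list containing several candidates.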

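Fix a probability distribution $P_{UX|S}$ and construct a code $\mathcal{C}$ as described in
Subsection 2.1. The encoding error is described in (12). Let us start with the probability of list
error. A list error occurs when

$$ I(\mathbf{u};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}}) \le T + \alpha \left| I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') \right|^+ $$

for some $l'$, $\lambda'$ and $m' \neq m$, conditioned on $\mathbf{u}$, $\mathbf{s}$, $\mathbf{y}$ and $T_{\mathbf{usxy}}$. This happens if

$$ I(\mathbf{u};\mathbf{y}) - I^*_{US}(\lambda) \le T $$

or

$$ T < I(\mathbf{u};\mathbf{y}) - I^*_{US}(\lambda) \le T + \alpha \big[ I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(\lambda') \big] , $$

similarly to the second part of the proof of Theorem 1. From this point on we follow the
derivation of $\Pr\{\mathcal{E}_1\}$ in the proof of Theorem 1 and obtain the desired exponent (38).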

Our next step is to upper bound the average number of incorrect messages on the list, $N_I$.
Following [1, eq. (12)-(13)], the average number of incorrect codewords is

$$ \begin{aligned} N_I &= \frac{1}{M} \sum_{m=1}^{M} \sum_{m' \neq m} \Pr\{m' \text{ is on list} \,|\, m \text{ was sent}\} \\ &= \sum_{m' > 1} \Pr\{m' \text{ is on list} \,|\, m = 1\} \\ &= (M - 1) \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}} | m = 1) \Pr\{m' \neq m \text{ is on list} \,|\, m = 1, T_{\mathbf{usxy}}\} , \end{aligned} \tag{40} $$

where the second equality is because the messages are equiprobable. The probability that
$m' > 1$ is on the list given that $m = 1$ was sent can be bounded as follows

$$ \Pr\{m' \text{ is on list} \,|\, m = 1\} \le \Pr\{\mathcal{E}_c(1) \,|\, T_{\mathbf{usxy}}, m = 1\} + \Pr\{m' \neq m \text{ is on list} \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\} \tag{41} $$

where $\mathcal{E}_c(1)$ is the encoding error event of message $m = 1$. Applying (14) we get

$$ N_I \le M \sum_{T_{\mathbf{usxy}}} P(T_{\mathbf{usxy}}) \Big[ \Pr\{\mathcal{E}_c(1) \,|\, T_{\mathbf{usxy}}, m = 1\} + \Pr\{m' \neq m \text{ is on list} \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\} \Big] . \tag{42} $$

The probability that $m' \neq 1$ is on the decoding list, given that $m = 1$ was sent successfully
and given $T_{\mathbf{usxy}}$, satisfies

$$ \Pr\{m' \neq m \text{ is on list} \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\} = 1 - \prod_{P_{\mathbf{s}'}} \Big[ 1 - P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}}) \Big] , \tag{43} $$

where $P_{e2}(\mathbf{u},\mathbf{y},P_{\mathbf{s}'},T_{\mathbf{usxy}})$ is the probability that there exists $\mathbf{u}_{l',m'} \in \mathcal{C}(P_{\mathbf{s}'})$ which obeys
the decoding rule conditioned on $\mathbf{u}$, $\mathbf{y}$ and $T_{\mathbf{usxy}}$ for some $l'$. Namely, there exists a
codeword $\mathbf{u}_{l',m'}$ which "beats" all other codewords from a different column. This probability is
upper bounded by the probability of the event in which $\mathbf{u}_{l',m'} \in \mathcal{C}(P_{\mathbf{s}'})$ "beats" only the
correct codeword, i.e.,

$$ I(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) - I^*_{US}(P_{\mathbf{s}'}) > T + \alpha \left| I(\mathbf{u};\mathbf{y}) - I^*_{US}(\lambda) \right|^+ , $$

which in turn upper bounds (43). Therefore, from this point on we follow the derivation of
$\Pr\{\mathcal{E}_2\}$ in the proof of Theorem 1. Note that $M$ multiplies
$\Pr\{m' \neq m \text{ is on list} \,|\, T_{\mathbf{usxy}}, m = 1, \mathcal{E}_c(1)^c\}$ in (40). Therefore, the
coding rate $R$ is found outside the clipping function $|\cdot|^+$ in (39), unlike in (9). This implies
that $N_I$ might be greater than unity, as expected.
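
The effect of moving $R$ outside the clipping function can be seen by comparing the rate-dependent terms of (9) and (39) for fixed values of $J$, $T$, $\alpha$ and $R$. The sketch below uses hypothetical function names and toy numbers, and it evaluates only these terms, not the full min-max expressions.

```python
def clip_plus(t):
    """|t|^+ = max{0, t}."""
    return max(0.0, t)

def erasure_rate_term(J, T, alpha, R):
    """Second term of (9): | T + alpha*|J|^+ - R |^+ (R inside the clipping)."""
    return clip_plus(T + alpha * clip_plus(J) - R)

def list_rate_term(J, T, alpha, R):
    """Second term of (39): | T + alpha*|J|^+ |^+ - R (R outside the clipping)."""
    return clip_plus(T + alpha * clip_plus(J)) - R

# Toy values: the list term is negative while the erasure-type term is clipped at 0.
J, T, alpha, R = 0.2, -0.1, 0.5, 0.3
print(erasure_rate_term(J, T, alpha, R), list_rate_term(J, T, alpha, R))
```

A negative value of the term from (39) corresponds to a bound on $N_I$ that grows with $N$ when the divergence term is small, in line with the remark that $N_I$ may exceed unity.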

□ 

5 Discussion 

In this paper, we proposed universally achievable error exponents for decoding with an
erasure option and a variable size list. These results were obtained by examining a universal
decoder with an erasure option, inspired by Csiszar and Korner's decoder [2, p. 176] for DMCs. By
changing the decoder's parameters, one can switch from list decoding ($T \in \mathbb{R}$, $\alpha \in (0, 1)$)
to decoding with an erasure option ($T \ge 0$, $\alpha \ge 1$). A similar behavior was exemplified
by Forney [1] with the optimal decoding rule for DMCs. The proposed decoder (3) has a
similar structure to Csiszar and Korner's decoder (2); however, it does not depend on the
coding rate $R$, which makes it more general.

Setting specific values of $\alpha$ and $T$ recovers some known results. If we take $\alpha = 1$
and $T = 0$ in Theorem 1 we get that $E_1(R,W,T,\alpha) = E_2(R,W,T,\alpha) = E(R,W)$, where
$E(R,W)$ is the exponent achieved in [8, Th. 3.2] for a known channel.

In [1, eq. (11a)], Forney proposed a suboptimal decoding rule with an erasure option
in which the decoder declares "$m$" if

$$ \Pr(\mathbf{y},\mathbf{x}_m) \ge e^{NT} \Pr(\mathbf{y},\mathbf{x}_{m_2}) , \tag{44} $$

otherwise, an "erasure" is declared, where $\Pr(\mathbf{y},\mathbf{x}_{m_2})$ is the probability of the second
most likely codeword, and $T$ is a positive parameter. Hence, the probability of the most
likely codeword must be at least $e^{NT}$ times higher than the probability of any other
codeword given $\mathbf{y}$. If we think of $\max_{l,\lambda} [ I(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - I^*_{US}(\lambda) ]$ as the normalized logarithm
of the empirical generalized a posteriori probability of message $m$ given $\mathbf{y}$, as stated in
[8, p. 1332], then by setting $\alpha = 1$ and $T \ge 0$ in (3), we obtain an empirical version of
Forney's suboptimal decoding rule (44).

Unlike [8], we fixed the penalty function $\psi$ and the bin-depth function $\rho$ beforehand.
Clearly, taking $\rho(P_{\mathbf{s}}) = I^*_{US}(P_{\mathbf{s}})$ is an optimal choice since it is the lowest exponential
rate which ensures a vanishing encoding error probability. Higher values of $\rho$ increase the
probability of decoding error (see [8, p. 1331]). However, it is not clear whether $\psi = \rho$ is
an optimal choice in (3) (at least not when $\alpha \neq 1$). If, for example, we derive the exponent
$E_1$ in Theorem 1 with a general penalty function $\psi(\cdot)$ we get that

$$ E_1(R,W,T,\alpha,\psi) = \max_{\psi} \min_{P_S} \max_{P_{UX|S}} \min\Big\{ \min_{P_{Y|XS}:\ I(Y;U) - \psi(P_S) \le T} \mathcal{D}(P_{Y|XS} \| W | P_S P_{UX|S}),\ \min_{P_{Y|XS}} \Big[ \mathcal{D}(P_{Y|XS} \| W | P_S P_{UX|S}) + \Big| \frac{1}{\alpha} \big( I(Y;U) - \psi(P_S) - T \big) + \min_{P_S} \big[ \psi(P_S) - I(U;S) \big] - R \Big|^+ \Big] \Big\} , $$

where $I(U;S)$ is the mutual information induced by $P_S(S) P_{U|S}(U|S)$. As one can see, $\psi$ is
involved in many places of the above expression and therefore cannot be easily optimized.
Moreover, the argument used in [8] to prove that $\psi = \rho$ is optimal cannot be applied here.
This calls for further investigation.

We note that Forney's derivation cannot be easily applied to state dependent channels
where side information is present at the transmitter. The main difficulty arises from the
fact that the overall channel from $\mathbf{u}$ to $\mathbf{y}$ is not memoryless.

6 Appendix 

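Proof of Lemma 1. Denote $J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) = I(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - I^*_{US}(\lambda)$. Suppose that the lemma
is false. Therefore, there are two vectors $\mathbf{u}_{\lambda,l,m}$ and $\mathbf{u}_{\tilde{\lambda},\tilde{l},\tilde{m}}$, with $\tilde{m} \neq m$, such that:

$$ J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) > T + \alpha \left| J(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) \right|^+ \quad \forall m' \neq m,\ \forall \lambda', l' , \tag{45} $$

$$ J(\mathbf{u}_{\tilde{\lambda},\tilde{l},\tilde{m}};\mathbf{y}) > T + \alpha \left| J(\mathbf{u}_{\lambda',l',m'};\mathbf{y}) \right|^+ \quad \forall m' \neq \tilde{m},\ \forall \lambda', l' . \tag{46} $$

Hence,

$$ J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) > T + \alpha \left| J(\mathbf{u}_{\tilde{\lambda},\tilde{l},\tilde{m}};\mathbf{y}) \right|^+ \tag{47} $$

$$ J(\mathbf{u}_{\tilde{\lambda},\tilde{l},\tilde{m}};\mathbf{y}) > T + \alpha \left| J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) \right|^+ . \tag{48} $$

From (47)-(48) it is clear that

$$ T + \alpha\, J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) < J(\mathbf{u}_{\tilde{\lambda},\tilde{l},\tilde{m}};\mathbf{y}) < \frac{1}{\alpha} \big[ J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - T \big] , \tag{49} $$

which implies that

$$ \begin{aligned} \alpha T + \alpha^2 J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) &< J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - T \\ 0 &< (1 - \alpha^2) J(\mathbf{u}_{\lambda,l,m};\mathbf{y}) - T(1 + \alpha) . \end{aligned} \tag{50} $$

Clearly, the right hand side of (50) cannot be positive since $1 - \alpha^2 \le 0$, $T(1 + \alpha) \ge 0$
and $J(\mathbf{u}_{\lambda,l,m};\mathbf{y})$ is positive following (45). Hence, the assumption that the decoding rule
is ambiguous is wrong. Note that the same proof can be used to show that for $\lambda \ge 1$,
Csiszar and Korner's decoder [2, Th. 5.11] is unambiguous. □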